
    Analysis on Partial Relationship in LOD

    Relationships play a key role in the Semantic Web: they connect the dots between entities (concepts or instances) in a way that conveys the real sense of those entities. Some relationships attest to the existence of the subject and object of a triple; these can be defined as evidential relationships. Identifying evidential relationships yields solutions to existing inference problems and opens doors to new applications and research. Among evidential relationships such as membership and causality, part_of relationships stand out as a special kind. Linked Open Data (LOD), as a global data space, provides a good platform for exploring these relationships and solving interesting inference problems. However, this is not trivial, because LOD datasets lack rich schemas and existing work on schema mapping in LOD is limited to concepts rather than relationships. This project develops a novel approach for identifying partial relationships, a superset of part_of relationships, from LOD instance data by analyzing the patterns in that instance data. Ultimately, the approach provides a way to enrich the shallow schemas in LOD, which in turn helps with schema matching in LOD. We apply the approach to the DBpedia dataset to identify its partial relationships.
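
    The abstract does not spell out how the instance data would be probed, so the following is a minimal sketch, assuming the public DBpedia SPARQL endpoint and a purely name-based filter, of how candidate part_of-style predicates could be surfaced from instance data; the predicate filter and the ranking are illustrative assumptions, not the project's actual analysis.

```python
# A minimal sketch (not the project's actual analysis): probe DBpedia instance
# data for predicates whose names suggest a partial (part_of-style) relationship
# and rank them by usage. Assumes the public DBpedia endpoint is reachable;
# the name filter below is an illustrative heuristic only.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
SELECT ?p (COUNT(*) AS ?uses) WHERE {
  ?s ?p ?o .
  FILTER(REGEX(STR(?p), "part_?of|ispartof", "i"))
}
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT 20
"""

def candidate_partial_predicates():
    """Return (predicate, usage count) pairs, most frequent first."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    bindings = sparql.query().convert()["results"]["bindings"]
    return [(b["p"]["value"], int(b["uses"]["value"])) for b in bindings]

if __name__ == "__main__":
    for predicate, uses in candidate_partial_predicates():
        print(f"{uses:>10}  {predicate}")
```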

    A Knowledge Graph Framework for Detecting Traffic Events Using Stationary Cameras

    With the rapid increase in urban development, it is critical to utilize dynamic sensor streams for traffic understanding, especially in larger cities where route planning and infrastructure planning are more critical. This creates a strong need to understand traffic patterns using ubiquitous sensors, so that city officials are better informed when planning urban construction and gain an understanding of the traffic dynamics in the city. In this study, we propose ITSKG (Imagery-based Traffic Sensing Knowledge Graph), a framework that uses stationary traffic cameras as sensors to understand traffic patterns. The proposed system extracts image-based features from traffic camera images, adds a semantic layer to the sensor data for traffic information, and then labels the traffic imagery with semantic labels such as congestion. We share a prototype example to highlight the novelty of our system and provide an online demo so that users can gain a better understanding of it. This framework adds a new dimension to existing traffic modeling systems by incorporating dynamic image-based features and by creating a knowledge graph that provides a layer of abstraction for understanding and interpreting concepts such as congestion in traffic event detection.
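
    As a rough illustration of the semantic layer described above, the sketch below attaches an image-derived vehicle count from a stationary camera to an RDF graph and tags it with a congestion label; the namespace, class and property names, and the threshold are assumptions for illustration, not ITSKG's actual schema.

```python
# A minimal sketch, not ITSKG's actual schema: attach an image-derived vehicle
# count from a stationary camera to an RDF graph and assert a Congestion event
# when it exceeds a threshold. The namespace, class/property names, and the
# threshold are illustrative assumptions.
from rdflib import Graph, Literal, Namespace, RDF, URIRef
from rdflib.namespace import XSD

TRAFFIC = Namespace("http://example.org/traffic#")

def add_observation(g: Graph, camera_id: str, frame_id: str,
                    vehicle_count: int, congestion_threshold: int = 25) -> Graph:
    """Record one camera-frame observation and its semantic label."""
    camera = URIRef(f"http://example.org/traffic/camera/{camera_id}")
    obs = URIRef(f"http://example.org/traffic/obs/{camera_id}/{frame_id}")
    g.add((camera, RDF.type, TRAFFIC.StationaryCamera))
    g.add((obs, RDF.type, TRAFFIC.Observation))
    g.add((obs, TRAFFIC.observedBy, camera))
    g.add((obs, TRAFFIC.vehicleCount, Literal(vehicle_count, datatype=XSD.integer)))
    if vehicle_count > congestion_threshold:
        # Semantic layer: label the raw image feature as a traffic event.
        g.add((obs, TRAFFIC.indicatesEvent, TRAFFIC.Congestion))
    return g

if __name__ == "__main__":
    g = add_observation(Graph(), camera_id="cam42", frame_id="f001", vehicle_count=31)
    print(g.serialize(format="turtle"))
```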

    Data Drift Monitoring for Log Anomaly Detection Pipelines

    Logs enable the monitoring of infrastructure status and the performance of associated applications. Logs are also invaluable for diagnosing the root causes of any problems that may arise. Log Anomaly Detection (LAD) pipelines automate the detection of anomalies in logs, assisting site reliability engineers (SREs) in system diagnosis. Log patterns change over time, necessitating updates to the LAD model that defines the 'normal' log activity profile. In this paper, we introduce a Bayes factor-based drift detection method that identifies when human intervention is required to retrain and update the LAD model. We illustrate our method using sequences of log activity, both from unaltered data and from simulated activity with controlled levels of anomaly contamination, based on real collected log data.
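
    The abstract does not give the underlying model, so the following is a minimal sketch of Bayes factor drift detection under one plausible assumption: anomaly counts in a reference window and a recent window are compared via Beta-Binomial marginal likelihoods, and drift is flagged when the separate-rates model is favoured.

```python
# A minimal sketch of Bayes-factor drift detection on log anomaly counts,
# assuming a Beta-Binomial model (the abstract does not give the paper's exact
# formulation). M0: reference and recent windows share one anomaly rate;
# M1: each window has its own rate. Binomial coefficients cancel in the ratio
# and are omitted.
from math import exp, lgamma

def log_beta(a: float, b: float) -> float:
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(anomalies: int, total: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Log marginal likelihood of `anomalies` out of `total` under a Beta(alpha, beta) prior."""
    return log_beta(alpha + anomalies, beta + total - anomalies) - log_beta(alpha, beta)

def bayes_factor(ref_anom: int, ref_total: int, new_anom: int, new_total: int) -> float:
    """BF10 > 1 favours drift, i.e. separate anomaly rates per window."""
    log_m0 = log_marginal(ref_anom + new_anom, ref_total + new_total)
    log_m1 = log_marginal(ref_anom, ref_total) + log_marginal(new_anom, new_total)
    return exp(log_m1 - log_m0)

if __name__ == "__main__":
    # Reference window: 12 anomalous sequences out of 1000; recent window: 55 out of 1000.
    bf = bayes_factor(12, 1000, 55, 1000)
    action = "flag for SRE review / retraining" if bf > 10 else "no action"
    print(f"BF10 = {bf:.1f} -> {action}")
```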

    Feedback-Driven Radiology Exam Report Retrieval with Semantics

    Clinical documents are vital resources for radiologists seeking a better understanding of patient history. Clinical documents can complement the often brief reasons for exams provided by physicians, enabling more informed diagnoses. Given the large number of study exams that radiologists have to perform on a daily basis, it is too time-consuming for them to sift through each patient's clinical documents. It is therefore important to present contextually relevant clinical documents while satisfying the diverse information needs of radiologists from different specialties. In this work, we propose a knowledge-based semantic similarity approach that uses domain-specific relationships such as part-of, along with taxonomic relationships such as is-a, to identify relevant radiology exam records. Our approach also incorporates explicit relevance feedback to personalize radiologists' information needs. We evaluated our approach on a corpus of 6,265 radiology exam reports through study sessions with radiologists and demonstrated that its retrieval performance yields an improvement of 5% over the baseline. We further computed intra-class and inter-class similarities using a subset of 2,384 reports spanning 10 exam codes. Our results show that intra-class similarities are always higher than inter-class similarities, and our approach obtained a 6% improvement in intra-class similarities over the baseline. These results suggest that the use of domain-specific relationships together with relevance feedback provides significant value in improving the accuracy of radiology exam report retrieval.
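
    As a rough illustration of the two ingredients described above, the sketch below combines a toy knowledge-based concept similarity that walks is-a and part-of links with Rocchio-style relevance feedback; the concept graph, weights, and feedback constants are stand-ins, not the paper's actual formulation.

```python
# A minimal sketch, not the paper's formulation: (1) a toy knowledge-based
# similarity that walks is-a and part-of links between concepts, and
# (2) Rocchio-style relevance feedback that shifts the query vector toward
# reports the radiologist judged relevant. Graph, weights, and constants are
# illustrative assumptions.
from collections import deque

# Toy concept graph: concept -> list of (related concept, relation type).
CONCEPT_GRAPH = {
    "left ventricle": [("heart", "part-of")],
    "heart": [("cardiovascular system", "part-of"), ("organ", "is-a")],
    "aorta": [("cardiovascular system", "part-of")],
}

def concept_similarity(a: str, b: str, max_depth: int = 4) -> float:
    """1 / (1 + shortest path length) over is-a and part-of links; 0 if unreachable."""
    if a == b:
        return 1.0
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth >= max_depth:
            continue
        for neighbour, _relation in CONCEPT_GRAPH.get(node, []):
            if neighbour == b:
                return 1.0 / (2 + depth)  # path length is depth + 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return 0.0

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant reports and away from non-relevant ones."""
    terms = set(query) | {t for d in relevant for t in d} | {t for d in non_relevant for t in d}
    updated = {}
    for t in terms:
        rel = sum(d.get(t, 0.0) for d in relevant) / max(len(relevant), 1)
        non = sum(d.get(t, 0.0) for d in non_relevant) / max(len(non_relevant), 1)
        updated[t] = alpha * query.get(t, 0.0) + beta * rel - gamma * non
    return updated

if __name__ == "__main__":
    print(concept_similarity("left ventricle", "cardiovascular system"))  # ~0.33
    print(rocchio({"chest pain": 1.0}, [{"chest pain": 0.8, "aorta": 0.5}], []))
```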

    A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi

    Effective research in parasite biology requires analyzing experimental lab data in the context of constantly expanding public data resources. Integrating lab data with public resources is particularly difficult for biologists who may not possess the computational skills needed to acquire and process heterogeneous data stored at different locations. We therefore developed a semantic problem solving environment (SPSE) that allows parasitologists to query their lab data, integrated with public resources, using ontologies. An ontology specifies a common vocabulary and formal relationships among terms, in this case terms describing an organism together with experimental data and processes. SPSE supports capturing and querying provenance information, which is metadata on the experimental processes and data recorded for reproducibility, and includes a visual query-processing tool for formulating complex queries without learning the query language syntax. We demonstrate the significance of SPSE in identifying gene knockout targets for T. cruzi. The overall goal of SPSE is to help researchers discover new or existing knowledge that is implicitly present in the data but not always easily detected. Results demonstrate improved usefulness of SPSE over existing lab systems and approaches, and support for complex query design that is otherwise difficult to achieve without knowledge of query language syntax.
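
    To make the querying style concrete, here is a hedged example of the kind of provenance-aware query the visual tool might assemble for a user; the endpoint URL and the ex: vocabulary are hypothetical placeholders rather than the actual parasite ontologies, while prov:wasGeneratedBy is the standard W3C PROV-O property.

```python
# A hedged example of a provenance-aware query over integrated lab + public
# data, expressed directly in SPARQL. The endpoint URL and the ex: vocabulary
# are hypothetical placeholders, not the actual parasite ontologies;
# prov:wasGeneratedBy is the standard W3C PROV-O property.
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX ex:   <http://example.org/parasite#>
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT ?gene ?level ?experiment WHERE {
  ?measurement a ex:ExpressionMeasurement ;
               ex:aboutGene ?gene ;
               ex:level ?level ;
               prov:wasGeneratedBy ?experiment .
  ?gene ex:organism ex:Trypanosoma_cruzi .
  FILTER(?level > 2.0)
}
ORDER BY DESC(?level)
"""

def knockout_candidates(endpoint: str = "http://localhost:8890/sparql"):
    """Query a hypothetical local endpoint holding the integrated data."""
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [(r["gene"]["value"], float(r["level"]["value"]), r["experiment"]["value"])
            for r in rows]
```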

    Domain-specific Knowledge Extraction from the Web of Data

    Domain knowledge plays a significant role in powering a number of intelligent applications such as entity recommendation, question answering, data analytics, and knowledge discovery. Recent advances in the Artificial Intelligence and Semantic Web communities have contributed to the representation and creation of this domain knowledge in machine-readable form. This has resulted in a large collection of structured datasets on the Web, commonly referred to as the Web of data. The Web of data has grown rapidly since its inception, which poses a number of challenges in developing intelligent applications that can benefit from its use. The majority of these applications are focused on a particular domain and hence can benefit from a relevant portion of the Web of data. For example, a movie recommendation application predominantly requires knowledge of the movie domain, and a biomedical knowledge discovery application predominantly requires knowledge of genes, proteins, chemicals, disorders, and their interactions. Using the entire Web of data is both unnecessary and computationally intensive, and the irrelevant portion adds noise that may degrade the application's performance. This motivates the need to identify and extract relevant data for domain-specific applications from the Web of data. This dissertation therefore studies the problem of domain-specific knowledge extraction from the Web of data.
    The rapid growth of the Web of data takes place in three dimensions: 1) the number of knowledge graphs, 2) the size of the individual knowledge graphs, and 3) the domain coverage. For example, the Linked Open Data (LOD) cloud, a collection of interlinked knowledge graphs on the Web, started with 12 datasets in 2007 and has evolved to more than 1100 datasets in 2017. DBpedia, a knowledge graph in the LOD cloud, started with 3 million entities and 400 million relationships in 2012 and has now grown to 38.3 million entities and 3 billion relationships. As we are interested in domain-specific applications and the domain of interest is already known, we propose to use the domain to restrict the other two dimensions of the Web of data. Reducing the first dimension requires reducing the number of knowledge graphs by identifying those relevant to the domain. However, this may still yield large knowledge graphs such as DBpedia, Freebase, and YAGO, which cover multiple domains including our domain of interest. Hence, the size of such knowledge graphs must also be reduced by identifying their relevant portions. This leads to two key research problems addressed in this dissertation: (1) Can we identify the relevant knowledge graphs that represent a domain? and (2) Can we identify the relevant portion of a cross-domain knowledge graph that represents the domain?
    A solution to the first problem requires automatically identifying the domain represented by each knowledge graph. This is challenging for several reasons: 1) knowledge graphs represent domains at different levels of abstraction and specificity, 2) a single knowledge graph can represent multiple domains (i.e., cross-domain knowledge graphs), and 3) the domains represented by knowledge graphs keep evolving. We propose to use existing crowd-sourced knowledge bases, together with their schemas, to automatically identify the domains, and we show the effectiveness of this approach in finding relevant knowledge graphs for specific domains.
    The challenge in addressing the second issue is the nature of the relationships connecting entities in these knowledge graphs. There are two types of relationships: 1) hierarchical relationships and 2) non-hierarchical relationships. While hierarchical relationships connect in-domain and out-of-domain entities using the same relationship type and hence represent uniform semantics, non-hierarchical relationships c..
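
    As one concrete reading of the size-reduction step, the sketch below grows a domain subgraph from seed entities by following hierarchical links only; the choice of DBpedia's dct:subject and skos:broader as the hierarchical relationships, and the hop limit, are illustrative assumptions rather than the dissertation's actual method.

```python
# A minimal sketch of reducing a cross-domain knowledge graph to a domain
# subgraph: expand from seed entities along hierarchical links only. The
# dct:subject / skos:broader choice and the hop limit are illustrative
# assumptions, not the dissertation's method.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"
HIERARCHICAL = ("http://purl.org/dc/terms/subject",
                "http://www.w3.org/2004/02/skos/core#broader")

def hierarchical_neighbours(entity: str) -> list:
    """Objects reachable from `entity` via the chosen hierarchical predicates."""
    values = " ".join(f"<{p}>" for p in HIERARCHICAL)
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"SELECT ?o WHERE {{ <{entity}> ?p ?o . VALUES ?p {{ {values} }} }}")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [r["o"]["value"] for r in rows]

def domain_subgraph(seeds: list, hops: int = 2) -> set:
    """Breadth-first expansion along hierarchical relationships only."""
    visited, frontier = set(seeds), set(seeds)
    for _ in range(hops):
        reached = {n for e in frontier for n in hierarchical_neighbours(e)}
        frontier = reached - visited
        visited |= frontier
    return visited

if __name__ == "__main__":
    movie_seeds = ["http://dbpedia.org/resource/Pulp_Fiction"]
    print(len(domain_subgraph(movie_seeds, hops=1)))
```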

    Creating Real-Time Dynamic Knowledge Graphs

    Alignment and Dataset Identification of Linked Data in Semantic Web

    The Linked Open Data (LOD) cloud has gained significant attention in the Semantic Web community over the past few years. With its rapid expansion in size and diversity, it consists of over 800 interlinked datasets containing over 60 billion triples. These datasets encapsulate structured data and knowledge spanning varied domains such as entertainment, life sciences, publications, geography, and government. Applications can take advantage of this by using the knowledge distributed over the interconnected datasets, which would be unrealistic to find in any single place. However, two key obstacles to using the LOD cloud are the limited support for data integration tasks over concepts, instances, and properties, and the selection of relevant data sources for querying over multiple datasets. We briefly review some of the important and interesting technical approaches in the literature that address these two issues. We observe that general-purpose alignment techniques developed outside the LOD context fall short of handling the heterogeneous data representations found in LOD. Therefore, an LOD-specific review of these techniques (especially for alignment) is important to the community. The topics covered in this article fall under two broad categories: alignment techniques for LOD datasets and relevant data source selection in the context of query processing over LOD datasets.
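
    To illustrate the data source selection problem mentioned above, the sketch below routes a query to datasets based on a predicate index, a common baseline in federated query processing over LOD; the catalogue is a toy stand-in for a real VoID or service-description index.

```python
# A minimal sketch of predicate-based source selection, a common baseline for
# routing a federated query over LOD datasets. The catalogue is a toy stand-in
# for a real VoID / service-description index of the predicates each dataset uses.
CATALOGUE = {
    "dbpedia":  {"http://dbpedia.org/ontology/birthPlace",
                 "http://dbpedia.org/ontology/director"},
    "geonames": {"http://www.geonames.org/ontology#parentCountry"},
}

def select_sources(query_predicates):
    """Return only the datasets whose predicate index overlaps the query's predicates."""
    wanted = set(query_predicates)
    return {name: preds & wanted for name, preds in CATALOGUE.items() if preds & wanted}

if __name__ == "__main__":
    query = ["http://dbpedia.org/ontology/director",
             "http://www.geonames.org/ontology#parentCountry"]
    for dataset, covered in select_sources(query).items():
        print(dataset, "->", sorted(covered))
```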

    A Systematic Property Mapping using Category Hierarchy and Data

    Relationships play a key role in the Semantic Web: they connect the dots between entities (concepts or instances) in a way that conveys the real sense of those entities. Even though relationships are important, they are difficult to categorize or identify because they encode complex knowledge in the schema. Systematically identifying relationships therefore yields many advantages and opens doors to new research avenues. In this work, we target a specific type of relationship (part-of) in a multi-domain dataset and devise an algorithm that uses Wikipedia to identify patterns of part-of relationships in the dataset. This paper reports initial, in-progress work on identifying part-of relationships.
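
    Since the algorithm itself is not described in the abstract, the sketch below shows one illustrative stand-in: Hearst-style lexical patterns over Wikipedia abstracts that flag candidate part-of pairs; the patterns and the example text are assumptions for illustration only, not the paper's category-hierarchy-based method.

```python
# A minimal sketch of mining part-of candidates from Wikipedia text using
# Hearst-style lexical patterns over article abstracts. This is an illustrative
# stand-in for the paper's category-hierarchy-based algorithm, whose details
# are not given in the abstract.
import re

PART_OF_PATTERNS = [
    re.compile(r"(?P<part>[A-Z][\w\s]+?) is (?:a |one of the )?parts? of (?P<whole>[A-Z][\w\s]+)"),
    re.compile(r"(?P<part>[A-Z][\w\s]+?) is located in (?P<whole>[A-Z][\w\s]+)"),
]

def extract_part_of(abstract_text: str):
    """Return (part, whole) candidate pairs matched by the lexical patterns."""
    pairs = []
    for pattern in PART_OF_PATTERNS:
        for match in pattern.finditer(abstract_text):
            pairs.append((match.group("part").strip(), match.group("whole").strip()))
    return pairs

if __name__ == "__main__":
    text = ("Haputale is a town in Badulla District. "
            "Badulla District is one of the parts of Uva Province.")
    print(extract_part_of(text))  # [('Badulla District', 'Uva Province')]
```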
